The Linear Combination Data Fusion Method in Information Retrieval
Abstract
In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility: different weights can be assigned to different component systems so as to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open problem. In this paper, we use the multiple linear regression technique to obtain optimum weights for all involved component systems. Optimum is meant in the least squares sense: the weights minimize the difference between the scores estimated for all documents by linear combination and the judged scores of those documents. Our experiments with four groups of runs submitted to TREC show that the linear combination method with such weights consistently outperforms, by large margins, the best component system and other major data fusion methods such as CombSum, CombMNZ, and the linear combination method with performance level and performance square weighting schemes.
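As a rough illustration of the idea in the abstract, the sketch below fits linear combination weights by ordinary least squares and compares the fused scores with CombSum and CombMNZ. It is not the authors' implementation: the function names, the toy score matrix, the use of an intercept term, and the assumption that component scores are already normalized to [0, 1] are all choices made for this example.

```python
import numpy as np


def fit_weights(scores, judged):
    """Least-squares fit of judged scores on component scores.

    scores: (n_docs, n_systems) array of normalized component scores.
    judged: (n_docs,) array of judged relevance scores.
    Returns (intercept, weights). The intercept is an assumption of this
    sketch; it does not change the ranking of documents.
    """
    X = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(X, judged, rcond=None)
    return coef[0], coef[1:]


def linear_combination(scores, weights, intercept=0.0):
    """Fused score: intercept plus the weighted sum of component scores."""
    return intercept + scores @ weights


def comb_sum(scores):
    """CombSum: unweighted sum of the component scores."""
    return scores.sum(axis=1)


def comb_mnz(scores):
    """CombMNZ: CombSum times the number of systems giving a non-zero score."""
    return scores.sum(axis=1) * (scores > 0).sum(axis=1)


if __name__ == "__main__":
    # Hypothetical training data: 6 documents, 3 component systems.
    train_scores = np.array([
        [0.9, 0.8, 0.1],
        [0.2, 0.1, 0.7],
        [0.5, 0.0, 0.4],
        [0.0, 0.3, 0.2],
        [0.6, 0.6, 0.6],
        [0.1, 0.9, 0.0],
    ])
    train_judged = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 1.0])

    intercept, weights = fit_weights(train_scores, train_judged)
    fused = linear_combination(train_scores, weights, intercept)

    # Rank documents from highest to lowest fused score.
    print("weights:", weights)
    print("ranking (linear combination):", np.argsort(-fused))
    print("ranking (CombSum):", np.argsort(-comb_sum(train_scores)))
    print("ranking (CombMNZ):", np.argsort(-comb_mnz(train_scores)))
```

In practice the weights would be fitted on one set of queries with relevance judgments and then applied to the component results for unseen queries; the example fits and ranks on the same toy data only to keep it short.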
Similar resources
Hyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations
The ability to record the high-resolution spectral signature of the Earth's surface is arguably the most important feature of hyperspectral sensors. Classification of hyperspectral imagery, in turn, is one of the methods for extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images in terms of information content, there...
Improving high accuracy retrieval by eliminating the uneven correlation effect in data fusion
The aim of this research is twofold. On the one hand, high accuracy retrieval has been a concern of the information retrieval community for some time. We aim to investigate this issue via data fusion. On the other hand, correlation among component results has been proven to be harmful to data fusion, but has not been taken into account in data fusion algorithms. In the hope of achieving better ...
A Score Fusion Method Using a Mixture Copula
In this paper, we propose a score fusion method using a mixture copula that can consider complex dependencies between multiple relevance scores in order to improve the effectiveness of information retrieval. The combination of multiple relevance scores has been shown to be effective in comparison with a single score. Widely used score fusion methods are linear combination and learning to rank. ...
The weighted Condorcet fusion in information retrieval
The Condorcet fusion is a distinctive fusion method and was found useful in information retrieval. Two basic requirements for the Condorcet fusion to improve retrieval effectiveness are: (1) all component systems involved should be more or less equally effective; and (2) each information retrieval system should be developed independently and thus each component result is more or less equally di...
Combination of Feature Selection and Learning Methods for IoT Data Fusion
In this paper, we propose five data fusion schemes for the Internet of Things (IoT) scenario, which are Relief and Perceptron (Re-P), Relief and Genetic Algorithm Particle Swarm Optimization (Re-GAPSO), Genetic Algorithm and Artificial Neural Network (GA-ANN), Rough and Perceptron (Ro-P) and Rough and GAPSO (Ro-GAPSO). All the schemes consist of four stages, including preprocessing the data set ba...